A Text Corpora-Based Estimation of the Familiarity of Health Terminology
Abstract
In a pilot effort to improve health communication, we developed a method for measuring the familiarity of medical terms. To obtain term familiarity data, we recruited 21 volunteers who agreed to take medical terminology quizzes containing 68 terms. We then built predictive models of familiarity based on term occurrence in text corpora and readers' demographics. Although the sample size was small, our preliminary results indicate that predicting the familiarity of medical terms from their frequency in text corpora is feasible. Further, individualized familiarity assessment is feasible when demographic features are included as predictors.
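As a rough illustration of the idea in the abstract, the sketch below scores a term by its smoothed log frequency in a corpus and feeds that score, together with one demographic feature, into a logistic function. The abstract does not specify the model form, the corpora, or the demographic variables, so everything here is an assumption: the function names `log_frequency` and `familiarity_probability`, the toy corpus, the choice of years of education as the demographic feature, and the weights, which are hypothetical and not fitted to any data.

```python
import math
from collections import Counter


def log_frequency(term, tokens):
    """Smoothed log10 relative frequency of a single-token term in a tokenized corpus."""
    counts = Counter(tokens)
    # Add-one smoothing so that unseen terms still get a finite score.
    return math.log10((counts[term] + 1) / (len(tokens) + 1))


def familiarity_probability(log_freq, years_education, weights):
    """Logistic model: P(familiar) from corpus frequency and one demographic feature."""
    z = (
        weights["bias"]
        + weights["freq"] * log_freq
        + weights["edu"] * years_education
    )
    return 1.0 / (1.0 + math.exp(-z))


# Toy corpus and hypothetical, unfitted weights (for illustration only).
corpus = "the patient has a fever and the doctor ordered a test for the fever".split()
weights = {"bias": 1.0, "freq": 2.0, "edu": 0.1}

for term in ["fever", "dyspnea"]:
    lf = log_frequency(term, corpus)
    p = familiarity_probability(lf, years_education=12, weights=weights)
    print(f"{term}: log_freq={lf:.2f}, P(familiar)={p:.2f}")
```

In practice the weights would be estimated from quiz responses such as those described above rather than set by hand, and the corpus would be far larger than this toy example.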
Similar resources
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Beyond its own merits, text classification (TC) has become a cornerstone of many applications. The work presented here is part of, and a prerequisite for, a project we have undertaken to create a corpus for Arabic text processing. It is an attempt to automatically create modules that would help speed up classification for any text categorization task. It also serves as a tool for...
Political Terms by APLL: Issues of Terminology Implantation and Acceptability
The present study investigates the implantation of political science terminology approved by the Academy of Persian Language and Literature (APLL) in the Hamshahri corpus made up of news text from Hamshahri newspaper and their acceptability among MA students of English translation studies (ETS), English literature (EL), and Political science (PS). To conduct this research the frequencies of the...
Extracting a Parallel Corpus from Comparable Documents to Improve Translation Quality in Machine Translation Systems
Data used for training statistical machine translation methods are usually prepared from three resources: parallel, non-parallel, and comparable text corpora. Parallel corpora are an ideal resource for translation, but because such texts are scarce, non-parallel and comparable corpora are also used for parallel text extraction. Most existing methods for exploiting comparable corpora loo...
Relationship Between Hospital Cost Based on Current Procedural Terminology and Out-of-Pocket Payment of Oil Company Retirees
Marzie Afshoon 1, Leila Riahi 2*, Leila Nazarimanesh 3. 1 Department of Health Services Management, Science and Research Branch, Islamic Azad University, Tehran, Iran. Abstract. Introduction: This study aimed to investigate the relationship between hospital cost based ...
A new model for Persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. In Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...